feat(desktop): stage latest runtime#466
Conversation
Greptile SummaryThis PR introduces
Confidence Score: 4/5The staging implementation is self-consistent and well-guarded, but the open issues from previous review threads — the global partial sweep under concurrent staging calls, and the incomplete already-staged integrity check — remain unaddressed and affect the new code paths introduced here. The two prior-thread findings are real defects in the code landing in this PR: cleanupPartials deletes all in-flight partials regardless of version (creating a silent corruption window under concurrency), and already-staged can return prematurely when node-pty is absent from an otherwise valid staged directory. No new blocking issues were found beyond those threads, but their presence in this exact code keeps confidence from a clean pass. packages/desktop/src/runtime-update.ts — the staging loop between cleanupPartials, the concurrent partial sweep, and the already-staged guard deserve attention before merge.
|
| Filename | Overview |
|---|---|
| packages/desktop/src/runtime-update.ts | New staging function: fetches latest kanban version via pacote, extracts to a partial dir, copies bundled node-pty, then atomically renames to the final versioned directory and writes the pointer. Staging logic is sound, but cleanupPartials sweeps all in-flight partials globally (concurrency risk flagged in prior review), and the already-staged guard only checks dist/cli.js presence, not node-pty (also flagged). No new blocking issues found. |
| packages/desktop/test/runtime-update.test.ts | Unit tests covering up-to-date, already-staged, bad-version, non-semver guards, and three staging scenarios (success, missing node-pty, missing cli.js, stale-partial recovery). pacote is fully mocked. The stale-partial recovery test omits an assertion that the partial dir was actually removed (flagged in prior review); the non-semver currentVersion test only checks outcome.kind, not the written pointer state. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[checkAndStageLatestRuntime] --> B[pacote.manifest kanban@latest]
B --> C{semver.valid?}
C -- No --> ERR1[throw non-semver error]
C -- Yes --> D{latest <= currentVersion?}
D -- Yes --> R1[return up-to-date]
D -- No --> E{isBadVersion?}
E -- Yes --> R2[return bad-version]
E -- No --> F{resolvePointerCliEntry === latest?}
F -- Yes --> R3[return already-staged]
F -- No --> G[cleanupPartials — sweep *.partial dirs]
G --> H[rm stage.partial — idempotent]
H --> I[mkdir versions/]
I --> J[pacote.extract kanban@version into stage.partial]
J --> K[cp bundled node-pty into stage.partial/node_modules/]
K --> L{dist/cli.js exists?}
L -- No --> ERR2[throw missing dist/cli.js]
L -- Yes --> M[rm finalDir — remove previous version]
M --> N[rename stage.partial to finalDir — atomic on same FS]
N --> O[writePointer — sync atomic JSON write]
O --> R4[return staged]
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
packages/desktop/src/runtime-update.ts:93-100
**Orphaned `finalDir` if `writePointer` throws after `rename`**
After `rename(stage, finalDir)` completes, the staged runtime lives in `finalDir` but is not yet pointed to by `current.json`. If `writePointer` throws (e.g. disk-full mid atomic-write), the function propagates the error and `finalDir` is left as an unreferenced, finalized-looking directory — not a `.partial` — so `cleanupPartials` on the next call will skip it. The next call will then hit `await rm(finalDir, …)` and wipe it before re-staging, which is correct but costs a full extra download. This is a narrow window and self-heals, but noting it here since the file header says "failures *before* the pointer write leave the existing pointer untouched" — a failure *inside* `writePointer` lands in a slightly different state (orphaned `finalDir`, pointer possibly half-written).
### Issue 2 of 2
packages/desktop/test/runtime-update.test.ts:140-158
**`non-semver currentVersion` test asserts only `outcome.kind`, not the written pointer**
Every other successful-staging test verifies that `readPointer(userData)` was written correctly and that the final directory layout is intact (e.g. `node_modules/node-pty` present, no `.partial` remnants). This test only checks `outcome.kind === "staged"`, so a regression that stages the right kind but writes a corrupt or missing pointer would pass undetected. Adding `expect(readPointer(userData)?.version).toBe("1.0.0")` would bring it in line with the main staging test.
Reviews (2): Last reviewed commit: "feat(desktop): stage latest runtime" | Re-trigger Greptile
| // `already-staged` requires both a pointer at `latest` AND its | ||
| // `cliEntry` actually present on disk. Without the file-exists | ||
| // check this gate would silently lie when the version dir was | ||
| // wiped (corrupt userData, manual cleanup, partial uninstall), | ||
| // leaving the user's runtime in a state where loadOverride keeps | ||
| // returning null *and* the updater keeps short-circuiting on | ||
| // "already-staged" forever. | ||
| const stagedCli = resolvePointerCliEntry(opts.userData); | ||
| if (stagedCli && versionFromCliEntry(opts.userData, stagedCli) === latest) { | ||
| return { kind: "already-staged" }; | ||
| } |
There was a problem hiding this comment.
already-staged integrity check only covers dist/cli.js, not node_modules/node-pty
resolvePointerCliEntry returns non-null only when dist/cli.js is present on disk, but it doesn't verify node_modules/node-pty still exists. If node-pty is removed from a staged version — say by a partial uninstall or manual userData cleanup that removes just the native module — this guard returns already-staged, the runtime spawns, and the missing native module causes an immediate crash. Recovery requires going through a full startup failure before markBadVersion is called. Adding existsSync(path.join(versionDir(opts.userData, latest), "node_modules", "node-pty")) to the already-staged condition would let the updater transparently re-stage instead.
Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/desktop/src/runtime-update.ts
Line: 65-75
Comment:
**`already-staged` integrity check only covers `dist/cli.js`, not `node_modules/node-pty`**
`resolvePointerCliEntry` returns non-null only when `dist/cli.js` is present on disk, but it doesn't verify `node_modules/node-pty` still exists. If `node-pty` is removed from a staged version — say by a partial uninstall or manual `userData` cleanup that removes just the native module — this guard returns `already-staged`, the runtime spawns, and the missing native module causes an immediate crash. Recovery requires going through a full startup failure before `markBadVersion` is called. Adding `existsSync(path.join(versionDir(opts.userData, latest), "node_modules", "node-pty"))` to the `already-staged` condition would let the updater transparently re-stage instead.
How can I resolve this? If you propose a fix, please make it concise.| cleanupPartials(opts.userData); | ||
| const stage = partialDir(opts.userData, latest); | ||
| await rm(stage, { recursive: true, force: true }); | ||
| await mkdir(path.dirname(stage), { recursive: true }); | ||
| await pacote.extract(`${PACKAGE}@${latest}`, stage); |
There was a problem hiding this comment.
cleanupPartials sweeps all in-flight partials globally
cleanupPartials removes every *.partial directory under versions/, not just the one for the current latest. If two code paths (e.g., two renderer windows or a background updater racing with a foreground check) call checkAndStageLatestRuntime concurrently — potentially for different versions — one call's cleanupPartials will delete the other's in-flight partial mid-extract. The second caller would then hit the rm(stage, { force: true }) step on a now-missing path (harmless) but the first caller's pacote.extract would be writing into a directory that was just deleted, producing a corrupt or empty stage. Scoping the sweep to only the current version's partial, or adding a simple lock/flag, would prevent this.
Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/desktop/src/runtime-update.ts
Line: 77-81
Comment:
**`cleanupPartials` sweeps all in-flight partials globally**
`cleanupPartials` removes every `*.partial` directory under `versions/`, not just the one for the current `latest`. If two code paths (e.g., two renderer windows or a background updater racing with a foreground check) call `checkAndStageLatestRuntime` concurrently — potentially for different versions — one call's `cleanupPartials` will delete the other's in-flight partial mid-extract. The second caller would then hit the `rm(stage, { force: true })` step on a now-missing path (harmless) but the first caller's `pacote.extract` would be writing into a directory that was just deleted, producing a corrupt or empty stage. Scoping the sweep to only the current version's partial, or adding a simple lock/flag, would prevent this.
How can I resolve this? If you propose a fix, please make it concise.| it("recovers from a stale `<v>.partial/` left by a prior interrupted run", async () => { | ||
| mkdirSync(`${versionDir(userData, "1.0.0")}.partial`, { recursive: true }); | ||
| manifestMock.mockResolvedValueOnce({ version: "1.0.0" }); | ||
|
|
||
| const outcome = await checkAndStageLatestRuntime({ | ||
| userData, | ||
| currentVersion: "0.5.0", | ||
| nativeDepsSource, | ||
| }); | ||
|
|
||
| expect(outcome.kind).toBe("staged"); | ||
| expect(readPointer(userData)?.version).toBe("1.0.0"); | ||
| }); |
There was a problem hiding this comment.
Stale-partial recovery test doesn't assert the partial is actually cleaned up
The test creates 1.0.0.partial/, stages, and asserts outcome.kind === "staged" and the pointer is written — but it doesn't verify the stale partial was removed. The main staging test already checks readdirSync(path.dirname(finalDir)).every((n) => !n.endsWith(".partial")), and adding the same assertion here would confirm that cleanupPartials ran correctly and no partial remnants remain after recovery.
Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/desktop/test/runtime-update.test.ts
Line: 193-205
Comment:
**Stale-partial recovery test doesn't assert the partial is actually cleaned up**
The test creates `1.0.0.partial/`, stages, and asserts `outcome.kind === "staged"` and the pointer is written — but it doesn't verify the stale partial was removed. The main staging test already checks `readdirSync(path.dirname(finalDir)).every((n) => !n.endsWith(".partial"))`, and adding the same assertion here would confirm that `cleanupPartials` ran correctly and no partial remnants remain after recovery.
How can I resolve this? If you propose a fix, please make it concise.
Split from #440.
This PR adds the pure staging function for desktop runtime updates: